9  Checking the linear link

So far we have considered statistical analyses for comparing mean values across several groups. The dependent variable was always quantitative (we compared its mean value across groups), while the independent variable was categorical: it took a finite number of values, and each value was a separate IV level, a separate group.

Now we turn to statistical tests that are used when both variables, DV and IV, are quantitative.

9.1 Correlation analysis

Correlation is a relationship between variables. Although it shares its name with one of the two types of relationships between variables, a correlation can be detected by essentially any type of analysis: the results of statistical tests only tell us that two variables are related (or not); they cannot tell us whether that relationship is causal or merely correlational.

Here we will talk specifically about correlation analysis – a special type of analysis for determining the significance of a linear relationship between only two quantitative or ordinal variables.

To derive the formula and meaning of correlation, let’s get acquainted with the concept of covariance.

Covariance is a measure of the joint variability of data, an indicator of how observations on two quantitative variables vary relative to each other.


\(\text{cov}(x,y)=\frac{\sum_{i=1}^n (x_i - \bar x) (y_i - \bar y )}{n-1}\)

A small surprise: try calculating the covariance of a variable with itself and look at the resulting formula. Does it remind you of anything?

Covariance with itself

\(\text{cov}(x,x)=\frac{\sum_{i=1}^n (x_i - \bar x) (x_i - \bar x )}{n-1} = \frac{\sum_{i=1}^n (x_i - \bar x )^2}{n-1}\)

This is variance!
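This is easy to check numerically. A small sketch in R with toy numbers (invented for illustration): we compute the covariance directly by the formula above and compare the covariance of a variable with itself to its variance.

```r
# Toy vectors, invented for illustration
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 5, 4, 5)

# Covariance exactly as in the formula: sum of cross-products over n - 1
cov_manual <- sum((x - mean(x)) * (y - mean(y))) / (length(x) - 1)

cov_manual           # 1.5, the same as cov(x, y)
cov(x, x) == var(x)  # covariance of a variable with itself is the variance
```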

The correlation coefficient is an indicator of the strength and direction of the relationship between variables. The magnitude of the number reflects the strength of the relationship, and the sign reflects the direction. In essence, it is the covariance of the variables, weighted by their standard deviations. This is done to standardize the coefficient, to move from absolute values to relative ones, and to bound the coefficient within [-1; 1]. For the Pearson correlation coefficient (correlation of two quantitative variables):

\(\text{corr}(x,y) = r_{xy} = \frac{\text{cov}(x, y)}{sd_x sd_y} = \frac{\sum_{i=1}^n (x_i - \bar x) (y_i - \bar y )}{(n-1)sd_x sd_y}\)
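The same kind of sanity check works for the correlation coefficient: divide the covariance by the two standard deviations and compare with the built-in cor() (toy data again):

```r
# Toy vectors, invented for illustration
x <- c(10, 20, 30, 40, 50)
y <- c(12, 25, 31, 45, 48)

r_manual <- cov(x, y) / (sd(x) * sd(y))  # covariance weighted by the two sds
all.equal(r_manual, cor(x, y))           # TRUE: this is exactly Pearson's r
```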

The coefficient of determination indicates the extent to which the variability of the data is explained by the selected independent variable. If we have only one IV, the coefficient of determination is practically the same as the correlation, only squared:

\(R^2 = r_{xy}^2 = \left(\frac{\text{cov}(x, y)}{sd_x sd_y}\right)^2 = \left(\frac{\sum_{i=1}^n (x_i - \bar x) (y_i - \bar y )}{(n-1)sd_x sd_y}\right)^2\)

Example from the site https://rpsychologist.com/correlation/

Guess the Correlation Game: http://guessthecorrelation.com/

9.1.1 Correlation test

Hypotheses about the presence of a linear relationship between variables are tested using a correlation test. It is a statistical criterion just like those we have already discussed; in essence, it is the same as linear regression with one predictor. The correlation test is used when both the DV and the IV are quantitative variables or measured on an ordinal scale (but not a nominal one). For quantitative scales the Pearson correlation coefficient is usually used; for ordinal variables, or quantitative ones with a small number of observations, the Spearman correlation coefficient.

The correlation test uses – you won’t believe it – the already familiar Student’s T-distribution! (That is, we only need to remember two distributions: the T-distribution and the F-distribution.)

The number of degrees of freedom is calculated using the formula

\(df = n - 2\), n – number of observations

Null and alternative hypotheses for the correlation test:

\(H_0\): \(r_{xy} = 0\)

\(H_1\): \(r_{xy} \neq 0\)
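For the curious, the t statistic behind this test can be written down explicitly. The formula below is the standard one for Pearson's r (it is not derived in this chapter, so treat this as an aside); we can check it against cor.test() on simulated data:

```r
set.seed(1)
x <- rnorm(30)
y <- 0.5 * x + rnorm(30)  # simulated data with a real linear relationship

r <- cor(x, y)
n <- length(x)
t_manual <- r * sqrt(n - 2) / sqrt(1 - r^2)  # t statistic with df = n - 2

all.equal(t_manual, unname(cor.test(x, y)$statistic))  # TRUE
```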

Like other criteria, it has assumptions.

9.1.2 Assumptions for the Correlation Test

(DV and IV are measured on a quantitative or ordinal scale)

  1. The relationship between the IV and the DV is linear – the plot shows no non-linear pattern and no clusters of data in different places.

  2. The DV is approximately normally distributed (strict normality is not required) and there are no noticeable outliers – this check was discussed here

Examples of what a nonlinear distribution might look like:

9.1.3 Nonparametric analogues

If the DV deviates greatly from the normal distribution, or the sample is small, or the DV is coded on an ordinal scale, the correlation test uses the Spearman correlation coefficient instead of the Pearson one – and this is the only difference.

There is also Kendall’s tau, which is almost the same as Spearman’s correlation, but we will not consider it, since it is used extremely rarely.
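The relationship between the two coefficients is simple: Spearman's correlation is Pearson's correlation computed on the ranks of the data. A quick check on simulated ordinal scores:

```r
set.seed(2)
x <- sample(1:5, 40, replace = TRUE)  # fake ordinal scores, for illustration
y <- sample(1:5, 40, replace = TRUE)

# method = "spearman" is just Pearson's r on the ranks
all.equal(cor(x, y, method = "spearman"), cor(rank(x), rank(y)))  # TRUE
```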

9.1.4 Calculating the Correlation Test

Let’s test the following hypothesis.

Students who rate their relationship with their parents as less supportive (variable famrel, less supportive coded as 1-2) drink more alcohol (variable Walc, values 4-5).

student school sex age address famsize Pstatus Medu Fedu Mjob Fjob reason guardian traveltime studytime failures schoolsup famsup paid_mat activities nursery higher internet romantic famrel freetime goout Dalc Walc health absences_mat G1_mat G2_mat G3_mat paid_por absences_por G1_por G2_por G3_por G_mat G_por absences_mat_groups absences_por_groups
id1 GP F 18 U GT3 A 4 4 at_home teacher course mother 2 2 0 yes no no no yes yes no no 4 3 4 1 1 3 6 5 6 6 no 4 0 11 11 5.666667 7.333333 middle less
id2 GP F 17 U GT3 T 1 1 at_home other course father 1 2 0 no yes no no no yes yes no 5 3 3 1 1 3 4 5 5 6 no 2 9 11 11 5.333333 10.333333 less less
id4 GP F 15 U GT3 T 4 2 health services home mother 1 3 0 no yes yes yes yes yes yes yes 3 2 2 1 1 5 2 15 14 15 no 0 14 14 14 14.666667 14.000000 less less
id5 GP F 16 U GT3 T 3 3 other other home father 1 2 0 no yes yes no yes yes no no 4 3 2 1 2 5 4 6 10 10 no 0 11 13 13 8.666667 12.333333 less less
id6 GP M 16 U LE3 T 4 3 services other reputation mother 1 2 0 no yes yes yes yes yes yes no 5 4 2 1 2 5 10 15 15 15 no 6 12 12 13 15.000000 12.333333 middle middle
id7 GP M 16 U LE3 T 2 2 other other home mother 1 2 0 no no no no yes yes yes no 4 4 4 1 1 3 0 12 12 11 no 0 13 12 13 11.666667 12.666667 less less
id8 GP F 17 U GT3 A 4 4 other teacher home mother 2 2 0 yes yes no no yes yes no no 4 1 4 1 1 1 6 6 5 6 no 2 10 13 13 5.666667 12.000000 middle less
id9 GP M 15 U LE3 A 3 2 services other home mother 1 2 0 no yes yes no yes yes yes no 4 2 2 1 1 1 0 16 18 19 no 0 15 16 17 17.666667 16.000000 less less
id10 GP M 15 U GT3 T 3 4 other other home mother 1 2 0 no yes yes yes yes yes yes no 5 5 1 1 1 5 0 14 15 15 no 0 12 12 13 14.666667 12.333333 less less
id11 GP F 15 U GT3 T 4 4 teacher health reputation mother 1 2 0 no yes yes no yes yes yes no 3 3 3 1 2 2 0 10 8 9 no 2 14 14 14 9.000000 14.000000 less less

Let’s also follow the algorithm.

DV – ordinal, IV – ordinal. Actually, in this design they are no longer dependent and independent, because the design is not experimental and does not assume a causal relationship. Our hypothesis is not about comparing groups with each other, but that these variables correlate – that there is a linear relationship between them.

Since both variables are ordinal, we need a nonparametric analogue of the Pearson correlation – the Spearman rank correlation (or ordinal logistic regression, if we want the relationship to have predictive power, but that is not the topic of this chapter).

cor.test(students$famrel, students$Walc, method = 'spearman')

    Spearman's rank correlation rho

data:  students$famrel and students$Walc
S = 6173557, p-value = 0.0196
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho 
-0.130423 

If we had two quantitative variables, we would simply visualize them with a scatterplot with the now familiar line in the middle of the dots. For example, like this:


But we have two ordinal variables, so a scatterplot would give an unclear picture. Instead, we will use a mosaic plot: the size of each tile reflects how often that combination of values of the two variables occurs.

Another option is a heatmap, where the tile dimensions are fixed and the color encodes the frequency of matches.
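A sketch of both plots in base R. The real analysis would use students$famrel and students$Walc; here the data are simulated so the snippet runs on its own:

```r
set.seed(3)
famrel <- sample(1:5, 200, replace = TRUE)  # stand-ins for the real variables
Walc   <- sample(1:5, 200, replace = TRUE)
tab <- table(famrel, Walc)

# Mosaic plot: tile size reflects how often each pair of values co-occurs
mosaicplot(tab, xlab = "famrel", ylab = "Walc", main = "famrel vs Walc")

# Heatmap: tiles are fixed, color encodes the frequency
image(1:5, 1:5, unclass(tab), xlab = "famrel", ylab = "Walc")
```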

9.1.5 Interpretation of results

When we interpret the results of a correlation test, we are usually interested in the value of the statistic (the t-value or F-value), the p-value, and the effect size. For a correlation test the statistic is a t-value, but what is usually reported is not the t-value itself but the correlation coefficient between the variables x and y, \(r_{xy}\) – it is also the effect size, an indicator of the magnitude of the relationship. The correlation test is the only test where we do not need an additional effect-size metric (such as Cohen’s d): we judge the strength of the relationship by the coefficient itself.

In the example above, we got r = -0.13. In a correlation coefficient we look at two things: the sign and the absolute value. Here the correlation is negative, so the relationship is inverse: as one variable increases (for example, the rating of family relationship quality famrel), the other variable (frequency of alcohol consumption Walc) decreases. An absolute value of 0.13 is a small number – a fairly weak correlation (you can check the breakdown by size in the section on effect sizes).

It is important that with a very large sample, even a very weak correlation will reach statistical significance! Therefore, do not get carried away running correlation tests between anything and everything: you will definitely find connections, and they will even be significant. As you can see, even r ≈ 0.1 can cross the threshold of statistical significance.

It is worth looking for correlations only between meaningful variables: since almost anything can come out significant with large samples, it may turn out that the number of films starring Nicolas Cage and the number of suicides by drowning are correlated – obviously, these quantities are not related to each other, and the correlation here is spurious. You can browse strange correlations at https://tylervigen.com/view_correlation?id=12692

Another important point: in a correlation test, even with a perfectly designed experiment, we cannot conclude a cause-and-effect relationship from the test alone. The theoretical possibility of concluding causation and the method of statistical analysis are different things – like warm and red, they refer to different dimensions. The possibility of concluding causation is determined by the design of the study, not by the statistical test. If we have a well-conducted controlled experiment, and the 3 conditions for establishing a cause-and-effect relationship are met (we discussed this here), then we can conclude it. Conversely, the same ANOVA may be used outside an experiment, and then we can still only conclude a correlational (associative) relationship.

9.1.6 Correlation matrices

Correlation matrix analysis is often used – correlations are calculated pairwise for every pair of variables. This can be found, for example, when correlating questionnaires: say there are questionnaires O1 and O2. Questionnaire O1 has subscales C11, C12, C13, C14, C15, and O2 correspondingly has C21, C22, C23, C24, C25. Then we can construct a correlation matrix for the subscales of these questionnaires.
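In R, the whole matrix can be obtained in one call to cor() on a data frame. A sketch with made-up subscale scores (the names mirror the example above):

```r
set.seed(4)
# Fake subscale scores, just to show the shape of the output
scales <- data.frame(C11 = rnorm(50), C12 = rnorm(50),
                     C21 = rnorm(50), C22 = rnorm(50))

m <- cor(scales, method = "spearman")  # all pairwise correlations at once
round(m, 2)  # a symmetric matrix with 1s on the diagonal
```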

9.2 Linear Regression Analysis

Linear regression analysis is exactly the same as the ANOVA (analysis of variance) we already know, only with the categorical IVs replaced by quantitative ones!

Linear regression itself is a straight line that we try to draw through all our points so that it passes as close to them as possible. In essence, this is the same as correlation, only a more powerful tool – here we can include several IVs.

Regression analysis is a pretty powerful thing, because here for the first time we start talking about the predictive function of the analysis. It turns out that regression analysis can be used:

  • To test hypotheses about the presence of a linear relationship between quantitative or ordinal variables

  • To predict values of the DV beyond the available data

For now, we are interested in the first of these functions, although very often linear regression analysis is interesting precisely from the point of view of the second.

Regression analysis is based on the construction of a regression line: any straight line has the form \(y = kx + b\); in regression analysis this equation is usually written as \(y = b_0 + b_1x\). The task of regression analysis is to estimate and test the coefficients \(b_0\) and \(b_1\) of the linear regression.

9.2.1 Regression coefficients

The equation of the regression line we have drawn is:

\(\hat y = b_0 + b_1x\)

We see that most of the points do not fit the line perfectly - there is still some distance along the y-axis to the point itself. Therefore, if we write down the equation for each point using the regression line equation, it will look like this:

\(y = b_0 + b_1x + e\)

The distance along the y-axis that remains to the points after we have drawn a straight line through them is called residuals – that is, these are the differences between the original data and those described by our model (line), what “remains”:

\(e = y - \hat y\)

Note that when we talk about the equation of a straight line, we denote y as \(\hat y\), and when we talk about actual points, we will simply denote it \(y\).

The regression line is often also called a model. The equation of the regression line with each new set of coefficients is a new model.

  • Coefficient \(b_1\) is responsible for the slope of the line

  • Coefficient \(b_0\) is responsible for the displacement of the line along the y-axis (intercept)

The coefficients are calculated in such a way that the sum of the squares of the residuals is minimal. This is called the least squares method.

When constructing a regression line, we strive to minimize the sum of the squared residuals:

\(\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n}(y_i - \hat y_i)^2\)

The formulas for the coefficients using the least squares method are equal to:

\(b_{1_{xy}} = \frac{sd_y}{sd_x} r_{xy}\)

\(b_0 = \bar y - b_{1_{xy}}\bar x\)

When calculating the coefficients, \(b_1\) is calculated first, and, as the formula shows, it depends on the variability of the data in the variables x and y (their standard deviations or variances). When the variability is equal, \(b_1\) equals the correlation coefficient \(r_{xy}\).
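These formulas are easy to verify against lm(), R's regression workhorse, on simulated data:

```r
set.seed(5)
x <- rnorm(40, mean = 10, sd = 2)
y <- 3 + 1.5 * x + rnorm(40)  # a noisy linear relationship

b1 <- sd(y) / sd(x) * cor(x, y)  # slope via the least-squares formula
b0 <- mean(y) - b1 * mean(x)     # intercept

coef(lm(y ~ x))  # the same b0 (Intercept) and b1 (slope)
```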

9.2.2 Coefficient of determination and proportion of explained variability

In linear regression, as in ANOVA, the coefficient of determination tells us the percentage of variability explained , that is, how well our regression model explains the variability in the dependent variable.

As in ANOVA, the total sum of squares SST consists of the explained sum of squares (SSE, Sum of Squares Explained; in ANOVA, SSB, Sum of Squares Between groups) and the residual sum of squares (SSR, Sum of Squares Residual; in ANOVA, SSW, Sum of Squares Within groups).

\(SST = SSE + SSR\)

The total variability is calculated relative to the horizontal line at the mean value of y:

\(SST = \sum_{i=1}^n (\bar y - y_i)^2\)

From the picture you can see that

\(SSE = \sum_{i=1}^n (\bar y - \hat y_i)^2\)

Residual variability:

\(SSR = \sum_{i=1}^n (y_i - \hat y_i)^2\)

To evaluate how good the model is, we again resort to the coefficient of determination:

\(R^2 = \frac{SSE}{SST} = 1 - \frac{SSR}{SST}\)

The coefficient of determination can be thought of as the size of the effect – and it is nothing more than the already familiar \(\eta^2\)!

\(\eta^2 = \frac{SSE}{SST}\)

In linear regression analysis, the coefficient of determination can also be viewed through the correlation between the original values of the variable \(y\) and the predicted values \(\hat y\): it is equal to the square of that correlation:

\(R^2 = r_{y \hat y}^2 = \left(\frac{\text{cov}(y, \hat y)}{sd_y sd_{\hat y}}\right)^2\)
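All of these identities can be checked on a fitted model. A sketch with simulated data:

```r
set.seed(6)
x <- rnorm(50)
y <- 2 + 0.8 * x + rnorm(50)
fit <- lm(y ~ x)
y_hat <- fitted(fit)

SST <- sum((y - mean(y))^2)      # total variability
SSE <- sum((y_hat - mean(y))^2)  # explained by the model
SSR <- sum((y - y_hat)^2)        # residual

SSE / SST        # equals summary(fit)$r.squared
cor(y, y_hat)^2  # and equals the squared correlation of y with y_hat
```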

9.2.3 Regression analysis (testing regression coefficients)

Regression analysis is an interesting thing, as it consists of several layers that take something from ANOVA and something from correlation analysis. The significance of the coefficients is tested with a criterion from the T-distribution family, just as in correlation analysis; the model as a whole is tested with the F-criterion, just as in ANOVA. In regression analysis we are more interested in testing the significance of the coefficients – it is the coefficients on the factors in the model that tell us whether the influence of those factors is significant.

The number of degrees of freedom is calculated using the formula:

\(df = n - 2\), n – number of observations

Model equation:

\(\hat y = b_0 + b_1x\)

Null and alternative hypotheses:

\(H_0\): \(b_{1_{xy}} = 0\)

\(H_1\): \(b_{1_{xy}} \neq 0\)

The key statistic for the coefficients is the t-value; it is calculated as the coefficient divided by its standard error:

\(T = \frac{b_1}{se_{b_1}}\)
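This is easy to see in the coefficient table that summary() returns: the t value column is literally the estimate divided by its standard error.

```r
set.seed(7)
x <- rnorm(30)
y <- 1 + 0.5 * x + rnorm(30)

ctab <- summary(lm(y ~ x))$coefficients
ctab[, "t value"]                          # as reported by summary()
ctab[, "Estimate"] / ctab[, "Std. Error"]  # recomputed by hand: identical
```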

9.2.4 Multiple Regression Analysis

Multiple regression analysis implies the same thing, only new predictors appear (independent variables, also known as factors):

\(\hat y = b_0 + b_1x_1 + b_2x_2 + ... + b_nx_n\)

9.2.5 Assumptions for Regression Analysis

(Variables are measured on a quantitative or ordinal scale)

  1. The relationships between the variables are linear – there is no picture of non-linear relationships or clusters of data in different places.

  2. The residuals vary approximately equally along the entire line – homogeneity (or homoscedasticity) of the residuals. It is most often checked with a diagnostic scatter plot of the residuals against the predicted (fitted) values.

  3. The residuals are normally distributed – the same check as here, only for the residuals (a probability density plot of the residuals or a QQ-plot).

  4. For multiple linear regression – absence of multicollinearity (strong correlation between the independent variables). Checked using the VIF ("variance inflation factor").

Examples of diagnostic graphs for residuals: https://gallery.shinyapps.io/slr_diag/
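A sketch of how these checks look in code (simulated data; the car package line is commented out because it is an extra dependency):

```r
set.seed(8)
x1 <- rnorm(60)
x2 <- rnorm(60)
y  <- 1 + x1 + 0.5 * x2 + rnorm(60)
fit <- lm(y ~ x1 + x2)

plot(fit, which = 1)  # residuals vs fitted: the spread should look even (assumption 2)
plot(fit, which = 2)  # QQ-plot of the residuals (assumption 3)

# Multicollinearity check (assumption 4), if the car package is installed:
# car::vif(fit)  # values well above ~5 are usually taken as a warning sign
```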

9.2.6 Calculating Regression Analysis

When we run a regression analysis calculation, we end up with a table like this:

Designation Coefficient Statistic SE p-value
\(b_0\) Intercept \(t_{b0}\) \(SE_{b0}\) \(p_{b0}\)
\(b_1\) Coefficient for factor 1 \(t_{b1}\) \(SE_{b1}\) \(p_{b1}\)
\(b_2\) (if present) Coefficient for factor 2 \(t_{b2}\) \(SE_{b2}\) \(p_{b2}\)
\(b_3\) (if present) Coefficient for factor 3 \(t_{b3}\) \(SE_{b3}\) \(p_{b3}\)

Just like everywhere else, we are primarily interested in the value of the statistic (t-value) and the significance level (p-value); here we are also interested in the values of the coefficients themselves. If a coefficient is significant (p-value < alpha) – that is, the factor significantly affects the variability of the data – we can construct the regression line and write its equation from these values:


\(\hat y\) = Intercept + Coefficient for factor1 * Factor1 + Coefficient for factor2 * Factor2 + Coefficient for factor3 * Factor3

For example, let’s take another dataset with Udemy course information.

id title is_paid price num_subscribers avg_rating num_reviews num_comments num_lectures content_length_min published_time last_update_date category subcategory topic language course_url instructor_name instructor_url price_log num_subscribers_log
1039124 Start Finishing Your Projects TRUE 39.99 7047 4.625000 1151 220 41 129 2017-04-01 00:41:21 2019-01-07 Personal Development Personal Productivity Personal Productivity English /course/start-finishing-your-projects/ Charlie Gilkey /user/charlie-gilkey/ 3.688629 8.860357
1128200 Using Scrum to Complete Projects in your Client's Budget TRUE 19.99 11 3.750000 2 0 26 149 2017-05-02 22:06:57 2017-04-27 IT & Software Other IT & Software Scrum English /course/scrum-practices/ Beverly Reynolds /user/beverly-reynolds/ 2.995232 2.397895
760252 Learn US Politics with Film TRUE 109.99 245 3.950000 50 10 44 271 2016-02-25 17:31:21 2017-01-01 Teaching & Academics Social Science Political Science English /course/usapolitics/ Steven Ward, M.P.A. /user/stevenward6/ 4.700389 5.501258
1990786 Statistics Fundamentals and its Applications TRUE 19.99 11397 3.950000 540 119 37 197 2018-11-20 19:10:42 2020-03-01 Teaching & Academics Math Statistics English /course/statistics-fundamentals/ Amba Kumari /user/amba-kumari/ 2.995232 9.341105
3888974 Photoshop Elements 2021 TRUE 44.99 24 4.357143 7 2 117 547 2021-03-04 17:12:38 2021-03-03 Design Graphic Design & Illustration Photoshop Dutch /course/photoshop-elements-2021/ Martijn van Weeghel /user/martijn-van-weeghel/ 3.806440 3.178054
856424 SQL DBA For Beginners TRUE 124.99 302 3.650000 70 16 49 264 2016-05-25 14:46:09 2021-06-14 Development Database Design & Development SQL English /course/sql-dba-for-beginners/ Skill Tree /user/williamumusu/ 4.828234 5.710427
2816955 Microsoft Outlook Kurs für Einsteiger TRUE 199.99 1241 4.550000 50 14 36 264 2020-03-10 17:59:43 2022-10-07 Office Productivity Microsoft Microsoft Outlook German /course/microsoft-outlook-grundkurs/ Ben Polland /user/ben-polland/ 5.298267 7.123673
2271686 Ultimate Course for Stock Market Beginners - ZERO to HERO TRUE 19.99 3259 4.100000 87 22 56 402 2019-09-05 20:09:58 2019-09-03 Finance & Accounting Finance Stock Trading English /course/ultimate-course-for-stock-market-beginners-zero-to-hero/ Chetan Mirani /user/chetan-mirani/ 2.995232 8.089176
4142060 【Google Apps Script 超入門講座】2時間で知識ゼロから基礎をマスター|様々なアプリと連携して業務効率化 TRUE 3.00 1419 4.135416 226 19 43 121 2021-07-01 05:44:21 2022-06-19 Development Programming Languages Google Apps Script Japanese /course/googleappsscript/ 高幸 仲条 /user/zhong-tiao-gao-xing/ 1.098612 7.257708
3427978 Gerenciamento de manutenção Industrial - Parte 2 TRUE 99.90 18 4.300000 5 0 47 459 2020-10-30 16:36:39 2021-02-24 Business Industry Maintenance Management Portuguese /course/gerenciamento-de-manutencao-industrial-parte-2/ Lucenir Piovesan /user/lucenir-piovesan/ 4.604170 2.890372

And we will try to build a model of the cost of the course based on the number of students (you may not have noticed, but here we are getting very close to the real problems that data analysts solve).

Spoiler: since this is real data, we had to tinker with its preprocessing, and even after that, the best option for building a model looks like this:

It is obvious that the model will not work well here. Let’s take another unsuccessful example for building regression models:

Therefore, we will discard the idea of ​​predicting the price and rating for now and move on to something more prosaic – we will build a model of the course duration from the number of lectures.

  1. Hypothesis: the course duration content_length_min is determined by the number of lectures num_lectures

content_length_min ~ num_lectures

  2. We formulate the null and alternative hypotheses:

The null hypothesis is that the coefficient \(b_1\) for num_lectures equals zero (that is, num_lectures does not affect the variability of the data); the alternative is that it does not.

\(H_0\): \(b_{1_{xy}} = 0\)

\(H_1\): \(b_{1_{xy}} \neq 0\)

  3. We fix that we will test the hypothesis at the level \(\alpha = 0.05\)

  4. We choose a statistical criterion for the test. Let’s see how linear the relationship between the variables looks and how the residuals are distributed:

Not the best option, but you can work with it (why? how is it different from the previous picture?)

  5. We build a regression model, run the regression analysis, and look at the significance of the coefficients

Call:
lm(formula = udemy_model$content_length_min ~ udemy_model$num_lectures)

Residuals:
    Min      1Q  Median      3Q     Max 
-178.70  -46.37  -10.45   39.27  210.40 

Coefficients:
                         Estimate Std. Error t value            Pr(>|t|)    
(Intercept)               76.1720     5.2651   14.47 <0.0000000000000002 ***
udemy_model$num_lectures   2.2852     0.1913   11.95 <0.0000000000000002 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 62.67 on 497 degrees of freedom
Multiple R-squared:  0.223, Adjusted R-squared:  0.2215 
F-statistic: 142.7 on 1 and 497 DF,  p-value: < 0.00000000000000022
  6. Let’s interpret the results: what is the p-value for the coefficient of num_lectures? We see that it is very small and clearly less than the declared alpha level – that is, the coefficient is significant, and our hypothesis that the number of lectures determines the course length is confirmed, hurray! What is the value of the coefficient itself? About 2.29. That is, with each additional lecture, the course length increases by about 2.29 minutes! What is \(R^2\)? It is equal to 0.22, which is not very much in general, but it is already a result. That is, 22% of the variability in course duration is explained by the number of lectures!

I can now write the equation of the regression line as follows:

\(\widehat{content\_length\_min} = 76.17 + 2.29 \times num\_lectures\)
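With this equation we can make the kind of prediction mentioned at the start of the section, for example for a hypothetical course with 50 lectures (the rounded coefficients are taken from the summary() output above):

```r
b0 <- 76.17  # intercept from the summary() output
b1 <- 2.29   # slope for num_lectures

b0 + b1 * 50  # predicted length of a 50-lecture course, in minutes (about 191)
```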

Let’s do the same analysis, but taking into account several factors (predictors). Let’s assume that the course duration is also explained by the number of subscribers, num_subscribers_log.

model_length2 <- lm(udemy_model$content_length_min ~ udemy_model$num_lectures + udemy_model$num_subscribers_log) 
summary(model_length2)

Call:
lm(formula = udemy_model$content_length_min ~ udemy_model$num_lectures + 
    udemy_model$num_subscribers_log)

Residuals:
    Min      1Q  Median      3Q     Max 
-179.07  -46.27  -10.45   39.03  210.61 

Coefficients:
                                Estimate Std. Error t value            Pr(>|t|)
(Intercept)                      76.7846     7.7452   9.914 <0.0000000000000002
udemy_model$num_lectures          2.2895     0.1957  11.700 <0.0000000000000002
udemy_model$num_subscribers_log  -0.1495     1.3847  -0.108               0.914
                                   
(Intercept)                     ***
udemy_model$num_lectures        ***
udemy_model$num_subscribers_log    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 62.73 on 496 degrees of freedom
Multiple R-squared:  0.2231,    Adjusted R-squared:  0.2199 
F-statistic:  71.2 on 2 and 496 DF,  p-value: < 0.00000000000000022

What can be said about these results? Are both coefficients significant?